Protein attributes contribute to halo-stability, bioinformatics approach
Authors
Abstract
Halophilic proteins can tolerate high salt concentrations. Understanding the features that underlie halophilicity is the first step toward engineering halostable crops. To this end, we examined the protein features that contribute to the halotolerance of halophilic organisms. We compared more than 850 features of halophilic and non-halophilic proteins using various screening, clustering, decision tree, and generalized rule induction models to search for patterns associated with halotolerance. The attribute weighting algorithms selected up to 251 protein attributes as important contributors to halo-stability; of these, 14 attributes were selected by 90% of the models, and the count of hydrogen received the highest weight (1.0) in 70% of the attribute weighting models, which shows the importance of this attribute in feature selection modeling. Most of the other attributes were dipeptide frequencies. The number of groups did not change when K-Means and TwoStep clustering were run on datasets with or without feature selection filtering. Although the induced trees were shallow, their accuracies were higher than 94%, and the frequency of hydrophobic residues emerged as the most important feature for building the trees. The performance evaluations of the decision tree models had the same values, and the highest percentage of correctness was recorded for the Exhaustive CHAID and CHAID models. We found no significant difference in the percentage of correctness, performance evaluation, or mean correctness of the various decision tree models with or without feature selection. For the first time, we analyzed the performance of different screening, clustering, and decision tree algorithms for discriminating halophilic from non-halophilic proteins, and the results showed that amino acid composition can be used to discriminate between halo-tolerant and halo-sensitive proteins.
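The abstract names amino acid frequencies, dipeptide frequencies, and the frequency of hydrophobic residues among the attributes it compares. The following is a minimal sketch of how such composition features could be computed from a protein sequence. It is not the authors' pipeline: the function name `composition_features`, the feature keys, and the choice of hydrophobic residues are assumptions made for illustration, and the study's 850+ features cover far more than this.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: one common choice of hydrophobic residues.
# The study may define this set differently.
HYDROPHOBIC = set("AVLIMFWC")


def composition_features(sequence):
    """Return amino acid, dipeptide, and hydrophobic-residue frequencies.

    Hypothetical helper for illustration only.
    """
    seq = sequence.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two residues")

    features = {}

    # Single-residue frequencies: count of each amino acid / sequence length.
    counts = Counter(seq)
    for aa in AMINO_ACIDS:
        features[f"freq_{aa}"] = counts[aa] / n

    # Dipeptide frequencies: count of each overlapping pair / (length - 1).
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    for a, b in product(AMINO_ACIDS, repeat=2):
        features[f"dipep_{a}{b}"] = pairs[a + b] / (n - 1)

    # Fraction of residues that fall in the assumed hydrophobic set.
    features["hydrophobic_fraction"] = sum(counts[aa] for aa in HYDROPHOBIC) / n

    return features


if __name__ == "__main__":
    # Toy sequence, not a real halophilic protein.
    feats = composition_features("DDEEAV")
    print(len(feats), feats["freq_D"], feats["dipep_DE"])
```

A table of such feature vectors, one row per protein and labeled halophilic or non-halophilic, is the kind of input that attribute weighting, clustering, or decision tree models would then take.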
Similar resources
Comparing various attributes of prolactin hormones in different species: application of bioinformatics tools
Prolactin is mainly secreted by the anterior pituitary and is able to stimulate mammary gland development and lactation in mammals. Although prolactins share a common ancestral gene encoding, they show species-specific characteristics and their efficiency may be different in various mammals. The importance of protein structures of all sequences of this hormone has been studied by various bi...
A Bioinformatics Approach to Prioritize Single Nucleotide Polymorphisms in TLRs Signaling Pathway Genes
It has been suggested that single nucleotide polymorphisms (SNPs) in genes involved in the Toll-like receptor (TLR) pathway may exhibit broad effects on the function of this network and might contribute to a range of human diseases. However, the extent to which these variations affect TLR signaling is not well understood. In this study, we adopted a bioinformatics approach to predict the consequences...
Designing and Analyzing the Structure of DT-STXB Fusion Protein as an Anti-tumor Agent: An in Silico Approach
Background & Objective: A major challenge in chemotherapy is to gain control over the biodistribution of cytotoxic drugs. The most promising strategy consists of drugs coupled with a tumor-targeting carrier, which results in broad cytotoxic activity and specific delivery. The B-subunit of Shiga toxin (STxB) is nontoxic, has low immunogenicity, and binds specifically to t...
Analysis of Missense Mutations of CX3CR1 Gene in Patients with Recurrent Pregnancy Loss Using Bioinformatics Tools
Introduction: Abortion is a common complication that refers to the early termination of pregnancy with the death of the fetus before the 20th week of pregnancy. Previous studies show that many genes are involved in this disease, including the CX3CR1 gene, which is one of the inflammatory response genes in the immune system. The pathogenicity of these variants was determined in this study using ...
Bioinformatics-Based Prediction of FUT8 as a Therapeutic Target in Estrogen Receptor-Positive Breast Cancer
Introduction: Estrogen receptor-positive (ER-positive) breast cancer is a subgroup of breast tumors that is more likely to respond to hormone therapy. ER-positive and ER-negative breast cancers tend to show different patterns of metastasis because of the different signaling cascades and genes that are activated by the estrogen response. Genetic factors can contribute to high rates of metastas...